Dense Feature Fusion for Online Mutual Knowledge Distillation


Abstract

Feature maps contain rich information about image intensity and spatial correlation. However, previous online knowledge distillation methods use only the class probabilities, ignoring intermediate-level supervision, which makes training multiple models inefficient. Even methods that add a feature-based loss yield only modest gains. We propose a new method that fuses features between peer networks to introduce intermediate supervision. Specifically, a fusion module with an auxiliary branch processes the feature information, so that features can effectively strengthen the interaction between networks. We also add a normalized ensemble output, with which our method reaches state-of-the-art knowledge distillation performance. Extensive experiments on the CIFAR-10, CIFAR-100, and ImageNet datasets show that our method is more effective than others in sub-classifier performance and generates meaningful feature maps.
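The paper's exact fusion module is not specified in this abstract, but the mutual-distillation objective it describes — each peer network learning from a normalized ensemble of its peers' softened predictions — can be sketched as follows. This is a minimal illustration under assumed conventions (temperature-softened softmax, KL divergence scaled by T², hypothetical function names), not the authors' implementation:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-softened softmax over the last axis (numerically stable).
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, eps=1e-12):
    # KL(p || q), averaged over the batch.
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

def mutual_distillation_losses(logits_list, T=3.0):
    """For each peer network, compute a distillation loss toward the
    normalized ensemble of the OTHER peers' softened predictions.

    logits_list: list of arrays of shape (batch, num_classes),
                 one per peer network.
    Returns one scalar loss per peer."""
    probs = [softmax(z, T) for z in logits_list]
    losses = []
    for i in range(len(probs)):
        peers = [p for j, p in enumerate(probs) if j != i]
        target = np.mean(peers, axis=0)
        # Renormalize the ensemble so it is a valid distribution.
        target = target / target.sum(axis=-1, keepdims=True)
        # T^2 scaling keeps gradient magnitudes comparable across temperatures.
        losses.append(kl(target, probs[i]) * T * T)
    return losses

# Example: three peer networks on a batch of 4 images, 10 classes.
logits = [np.random.randn(4, 10) for _ in range(3)]
print(mutual_distillation_losses(logits))  # three non-negative scalars
```

In training, each peer would minimize this distillation term alongside its usual cross-entropy loss; the feature-level fusion described in the abstract would add a further loss on intermediate feature maps.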


Related articles

Effective Online Knowledge Graph Fusion

Recently, Web search engines have empowered their search with knowledge graphs to satisfy increasing demands of complex information needs about entities. Each engine offers an online knowledge graph service to display highly relevant information about the query entity in form of a structured summary called knowledge card. The cards from different engines might be complementary. Therefore, it is...


Entanglement of Distillation and Conditional Mutual Information

In previous papers, we expressed the Entanglement of Formation in terms of Conditional Mutual Information (CMI). In this brief paper, we express the Entanglement of Distillation in terms of CMI.


Sequence-Level Knowledge Distillation

Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neura...


Knowledge Distillation for Bilingual Dictionary Induction

Leveraging zero-shot learning to learn mapping functions between vector spaces of different languages is a promising approach to bilingual dictionary induction. However, methods using this approach have not yet achieved high accuracy on the task. In this paper, we propose a bridging approach, where our main contribution is a knowledge distillation training objective. As teachers, rich resource ...


Feature Weight Driven Interactive Mutual Information Modeling for Heterogeneous Bio-Signal Fusion to Estimate Mental Workload

Many people suffer from high mental workload which may threaten human health and cause serious accidents. Mental workload estimation is especially important for particular people such as pilots, soldiers, crew and surgeons to guarantee the safety and security. Different physiological signals have been used to estimate mental workload based on the n-back task which is capable of inducing differe...



Journal

Journal title: Journal of Physics

Year: 2021

ISSN: 0022-3700, 1747-3721, 0368-3508, 1747-3713

DOI: https://doi.org/10.1088/1742-6596/1865/4/042084